Matching a tone-based and tune-based approach to English intonation for concept-to-speech generation
Authors
Abstract
The paper describes the results of a comparison of two annotation systems for intonation, the tone-based ToBI approach and the tune-based approach proposed by Systemic Functional Grammar (SFG). The goal of this comparison is to define a mapping between the two systems for the purpose of concept-to-speech generation of English. Since ToBI is widely used in speech synthesis and SFG is widely used in natural language generation and offers a linguistically motivated account of intonation, it appears a promising step to combine the two approaches for concept-to-speech. A corpus of English utterances has been analysed with both ToBI and SFG categories; comparison of the analysis results has led to the identification of some basic equivalents between the two systems on which a mapping can be based.

1 Introduction

The paper describes the main results of a comparison of the ToBI (Tone-and-Break-Indices) approach (Pierrehumbert, 1980; Silverman et al., 1996) to annotating English speech data with information about intonation and one of the British School approaches (e.g., Brazil et al. (1980)), Systemic Functional Grammar (SFG; (Halliday, 1967; Halliday, 1970)). The goal of this comparison is the definition of a mapping between the two systems. This attempt has a two-fold motivation. First, it is motivated by computational application in concept-to-speech systems, in which text in spoken mode is automatically generated from an underlying abstract meaning representation. It is widely acknowledged that in order for spoken language technology to gain wider acceptance, it has to improve on the quality of output considerably. Here, appropriate intonation is one of the major factors (cf. Cole et al. (1995)). The concrete goal we are pursuing is to connect an off-the-shelf speech synthesizer for English (FESTIVAL; (Black et al., 1998)) with an automatic text generation system for English based on SFG (Matthiessen & Bateman, 1991).
Since in the SFG approach, intonation is accounted for as part of grammar rather than as an independent component, it is straightforward to extend the grammatical resources of a systemically based text generation system with an account of intonation (cf. Teich et al. (1997) implementing such an approach for German concept-to-speech generation). Connecting such a system to a speech synthesizer requires mapping the output of the generator to the input requirements of the speech synthesizer. In the FESTIVAL system, the intonation of the text to be synthesized can be manipulated by annotation with ToBI labels. Therefore, a mapping between the SFG and the ToBI annotation systems is required. Second, there is a theoretical motivation. With a mapping between the ToBI and the SFG systems for intonation annotation, it will be possible to link the phonetic analysis of speech data to an interpretation of intonational meaning as it is proposed by SFG. Existing speech corpora that are acoustically analysed and annotated with ToBI can then be used to test some of the assumptions brought forward by SFG about the nature of intonation. Also, with a mapping between ToBI and SFG annotations, an exchange of annotated corpora between ToBI and SFG users would be possible. We report on the analysis of a speech corpus compiled from Halliday (1970) with ToBI and SFG labels (Sec. 3). The intonation analysis is based on an acoustic analysis of the speech data in terms of fundamental frequency (F0).
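The step of mapping generator output to synthesizer input could be sketched roughly as a table lookup. The sketch below is only an illustration under stated assumptions: the two table entries pair SFG tone 1 (falling) and tone 2 (rising) with common ToBI fall and rise patterns as placeholders, and they are not the equivalents identified by the corpus comparison. The names `ILLUSTRATIVE_SFG_TO_TOBI` and `sfg_to_tobi` are hypothetical, and no FESTIVAL API is used.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: placeholder equivalences for illustration only; they are not
# the mapping established by the corpus comparison described in the abstract.
ILLUSTRATIVE_SFG_TO_TOBI: Dict[str, str] = {
    "tone1": "H* L-L%",  # SFG tone 1 (falling) paired with a ToBI fall
    "tone2": "L* H-H%",  # SFG tone 2 (rising) paired with a ToBI rise
}


def sfg_to_tobi(tone_groups: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Replace the SFG tone of each (text, tone) pair with a ToBI label string."""
    annotated = []
    for text, sfg_tone in tone_groups:
        if sfg_tone not in ILLUSTRATIVE_SFG_TO_TOBI:
            # Fail loudly instead of guessing a label for an unmapped tone.
            raise KeyError(f"no ToBI equivalent recorded for {sfg_tone!r}")
        annotated.append((text, ILLUSTRATIVE_SFG_TO_TOBI[sfg_tone]))
    return annotated


if __name__ == "__main__":
    # Hypothetical generator output: tone groups tagged with SFG primary tones.
    print(sfg_to_tobi([("the train has left", "tone1"),
                       ("has the train left", "tone2")]))
```

Keeping the equivalences in a single table means the placeholder entries could be swapped for an empirically derived mapping without touching the lookup code; raising on unmapped tones avoids silently emitting a wrong contour.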
Similar resources
Rules for the generation of ToBI-based American English intonation
This study presents an approach to the generation of American English intonation based on prescriptive rules that define the respective features of certain tone labels that in turn represent linguistically relevant F0 configurations. In accordance with the principles of the Tone Sequence Model the F0 contour is analyzed as a series of discrete target values that are connected by means of transi...
New Pseudo-CT Generation Approach from Magnetic Resonance Imaging using a Local Texture Descriptor
Background: One of the challenges of combined PET/MRI systems is to derive an attenuation map to correct the PET image. For that, a pseudo-CT image could be used to correct the attenuation. Until now, most existing studies have constructed this pseudo-CT image using registration techniques. However, these techniques suffer from the local minima of the non-rigid deformation energy f...
Learning Intonation Rules for Concept to Speech Generation
In this paper, we report on an effort to provide a general-purpose spoken language generation tool for Concept-to-Speech (CTS) applications by extending a widely used text generation package, FUF/SURGE, with an intonation generation component. As a first step, we applied machine learning and statistical models to learn intonation rules based on the semantic and syntactic information typically r...
Better nonnative intonation scores through prosodic theory
Pronunciation scoring is one important task for software designed to give feedback to students practicing a second language. English intonation can convey information about a speaker’s nativeness, so previous studies have proposed using intonation-based models to score nonnative pronunciation. One past approach trained models for a set of pronunciation scores using ad hoc features derived from ...
Publication date: 2000